28 research outputs found

What is the difference between the breakpoint graph and the de Bruijn graph?

Author: Lin Yu
Nurk Sergey
Pevzner Pavel A.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/11/2018

The breakpoint graph and the de Bruijn graph are two key data structures in the studies of genome rearrangements and genome assembly. However, the classical breakpoint graphs are defined on two genomes (represented as sequences of synteny blocks), while the classical de Bruijn graphs are defined on a single genome (represented as DNA strings). Thus, the connection between these two graph models is not explicit. We generalize the notions of both the breakpoint graph and the de Bruijn graph, and make it transparent that the breakpoint graph and the de Bruijn graph are mathematically equivalent. The explicit description of the connection between these important data structures provides a bridge between two previously separated bioinformatics communities studying genome rearrangements and genome assembly.

The Australian National University

Optimizing sequencing protocols for leaderboard metagenomics by combining long and short reads.

Author: Arthur Timothy D
Bankevich Anton
Boland Brigid S
Brennan Caitriona
Chang John T
Chen Feng
Conrad Douglas J
Dang Jason W
Dorrestein Pieter C
Fedarko Marcus
Gaffney James
Green Cliff
Humphrey Greg C
Jepsen Kristen
Khosroheidari Mahdieh
Knight Rob
Liyanage Marlon
Martino Cameron
Minich Jeremiah
Nurk Sergey
Pevzner Pavel A
Phelan Vanessa V
Quinn Robert A
Rana Tariq M
Salido Rodolfo A
Sandborn William J
Sanders Jon G
Sanders Karenina
Smarr Larry
Xu Zhenjiang Z
Zhu Qiyun
Publication venue: eScholarship, University of California
Publication date: 01/10/2019

As metagenomic studies move to increasing numbers of samples, communities like the human gut may benefit more from the assembly of abundant microbes in many samples, rather than the exhaustive assembly of fewer samples. We term this approach leaderboard metagenome sequencing. To explore protocol optimization for leaderboard metagenomics in real samples, we introduce a benchmark of library prep and sequencing using internal references generated by synthetic long-read technology, allowing us to evaluate high-throughput library preparation methods against gold-standard reference genomes derived from the samples themselves. We introduce a low-cost protocol for high-throughput library preparation and sequencing.

eScholarship - University of California

Metagenomics-Based, Strain-Level Analysis of Escherichia coli From a Time-Series of Microbiome Samples From a Crohn's Disease Patient

Author: Akseshina Margarita
Allen-Vercoe Emma
Beck Paul L.
Fang Xin
Gemmell Christopher
Gianetto-Hill Connor
Gray-Owen Scott D.
Knight Rob
Leung Nelly
Li Weizhong
Monk Jonathan M.
Nurk Sergey
Palsson Bernhard O.
Sandborn William J.
Sanders Jon
Smarr Larry
Szubin Richard
Zhu Qiyun
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2018

Dysbiosis of the gut microbiome, including elevated abundance of putative leading bacterial triggers such as E. coli in inflammatory bowel disease (IBD) patients, is of great interest. To date, most E. coli studies in IBD patients are focused on clinical isolates, overlooking their relative abundances and turnover over time. Metagenomics-based studies, on the other hand, are less focused on strain-level investigations. Here, using recently developed bioinformatic tools, we analyzed the abundance and properties of specific E. coli strains in a Crohn's disease (CD) patient longitudinally, while also considering the composition of the entire community over time. In this report, we conducted a pilot study on metagenomic-based, strain-level analysis of a time-series of E. coli strains in a left-sided CD patient, who exhibited sustained levels of E. coli greater than 100X healthy controls. We: (1) mapped out the composition of the gut microbiome over time, particularly the presence of E. coli strains, and found that the abundance and dominance of specific E. coli strains in the community varied over time; (2) performed strain-level de novo assemblies of seven dominant E. coli strains, and illustrated disparity between these strains in both phylogenetic origin and genomic content; (3) observed that strain ST1 (recovered during peak inflammation) is highly similar to known pathogenic AIEC strains NC101 and LF82 in both virulence factors and metabolic functions, while other strains (ST2-ST7) that were collected during more stable states displayed diverse characteristics; (4) isolated, sequenced, experimentally characterized ST1, and confirmed the accuracy of the de novo assembly; and (5) assessed growth capability of ST1 with a newly reconstructed genome-scale metabolic model of the strain, and showed its potential to use substrates found abundantly in the human gut to outcompete other microbes.
In conclusion, inflammation status (assessed by the blood C-reactive protein and stool calprotectin) is likely correlated with the abundance of a subgroup of E. coli strains with specific traits. Therefore, strain-level time-series analysis of dominant E. coli strains in a CD patient is highly informative, and motivates a study of a larger cohort of IBD patients.

Frontiers - Publisher Connector

eScholarship - University of California

Online Research Database In Technology

FigShare

Metagenomics-Based, Strain-Level Analysis of Escherichia coli From a Time-Series of Microbiome Samples From a Crohn's Disease Patient

Author: Bernhard O. Palsson
Christopher Gemmell
Connor Gianetto-Hill
Emma Allen-Vercoe
Jon Sanders
Jonathan M. Monk
Larry Smarr
Margarita Akseshina
Nelly Leung
Paul L. Beck
Qiyun Zhu
Richard Szubin
Rob Knight
Scott D. Gray-Owen
Sergey Nurk
Weizhong Li
William J. Sandborn
Xin Fang
Publication venue: 'Frontiers Media SA'
Publication date: 01/10/2018


Directory of Open Access Journals

Assignment of virus and antimicrobial resistance genes to microbial hosts in a complex microbial community by combined long-read assembly and proximity ligation

Author: Adam M. Phillippy
AG Brownlee
AV Zimin
B Buchfink
BD Ondov
BL Brown
Bradd J. Haley
C Urbaniak
Cheryl Heiner
CL Anderson
CMK Sieber
Curtis P. Van Tassell
D Hyatt
D Li
DD Kang
DE Fouts
Derek M. Bickhart
DH Parks
DM Bickhart
DM Stevenson
DR Bentley
DR Laetsch
E Zankari
G Henderson
G Yu
Garret Suen
H Li
Ivan Liachko
J Eid
J Huerta-Cepas
J Wagner
JA Frank
Jay Ghurye
JN Burton
Jo Ann S. Van Kessel
KA Dill-McFarland
KA Jewell
Kevin Panke-Buisse
Kiranmayee Bakshy
Laura M. Cersosimo
LM Solden
M Hess
M Loose
M Pendleton
Martial Marbouty
Maximilian O. Press
MD Auffret
Mick Watson
Mihai Pop
MO Press
N Nagarajan
NB Shoemaker
P Shannon
Paul J. Weimer
Phillip R. Myer
PJ Weimer
R Mohammed
RD Stewart
RM Bowers
S Awad
S Conlan
S Louca
S Nurk
Seon Woo Kim
Sergey Koren
Shawn T. Sullivan
SM Karst
SS Paul
Timothy P. L. Smith
W Zhou
X-Q Li
Y-C Tsai
Z Yu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/08/2019

Crossref

Edinburgh Research Explorer

The complete sequence of a human genome.

Author: Nurk Sergey
Publication date: 22/07/2022

Ezid

The complete sequence of a human genome [preprint]

Author: Eichler Evan E.
Miga Karen H.
Nurk Sergey
Phillippy Adam M.
Rogaev Evgeny I.
Publication venue: eScholarship@UMassChan
Publication date: 27/05/2021

In 2001, Celera Genomics and the International Human Genome Sequencing Consortium published their initial drafts of the human genome, which revolutionized the field of genomics. While these drafts and the updates that followed effectively covered the euchromatic fraction of the genome, the heterochromatin and many other complex regions were left unfinished or erroneous. Addressing this remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium has finished the first truly complete 3.055 billion base pair (bp) sequence of a human genome, representing the largest improvement to the human reference genome since its initial release. The new T2T-CHM13 reference includes gapless assemblies for all 22 autosomes plus Chromosome X, corrects numerous errors, and introduces nearly 200 million bp of novel sequence containing 2,226 paralogous gene copies, 115 of which are predicted to be protein coding. The newly completed regions include all centromeric satellite arrays and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies for the first time.

eScholarship@UMMS

Model-driven discovery of underground metabolic functions in Escherichia coli

Author: Brunk Elizabeth
Ebrahim Ali
Feist Adam M.
Guzmán Gabriela I.
Monk Jonathan M.
Nurk Sergey
Palsson Bernhard O.
Utrilla José
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 01/01/2015

Enzyme promiscuity toward substrates has been discussed in evolutionary terms as providing the flexibility to adapt to novel environments. In the present work, we describe an approach toward exploring such enzyme promiscuity in the space of a metabolic network. This approach leverages genome-scale models, which have been widely used for predicting growth phenotypes in various environments or following a genetic perturbation; however, these predictions occasionally fail. Failed predictions of gene essentiality offer an opportunity for targeting biological discovery, suggesting the presence of unknown underground pathways stemming from enzymatic cross-reactivity. We demonstrate a workflow that couples constraint-based modeling and bioinformatic tools with KO strain analysis and adaptive laboratory evolution for the purpose of predicting promiscuity at the genome scale. Three cases of genes that are incorrectly predicted as essential in Escherichia coli (aspC, argD, and gltA) are examined, and isozyme functions are uncovered for each to a different extent. Seven isozyme functions based on genetic and transcriptional evidence are suggested between the genes aspC and tyrB, argD and astC, gabT and puuE, and gltA and prpC. This study demonstrates how a targeted model-driven approach to discovery can systematically fill knowledge gaps, characterize underground metabolism, and elucidate regulatory mechanisms of adaptation in response to gene KO perturbations.

PubMed Central

Online Research Database In Technology

Metagenomics-Based, Strain-Level Analysis of Escherichia coli From a Time-Series of Microbiome Samples From a Crohn's Disease Patient.

Author: Akseshina Margarita
Allen-Vercoe Emma
Beck Paul L
Fang Xin
Gemmell Christopher
Gianetto-Hill Connor
Gray-Owen Scott D
Knight Rob
Leung Nelly
Li Weizhong
Monk Jonathan M
Nurk Sergey
Palsson Bernhard O
Sandborn William J
Sanders Jon
Smarr Larry
Szubin Richard
Zhu Qiyun
Publication venue: eScholarship, University of California
Publication date: 01/01/2018

eScholarship - University of California
